Algorithm-based fault-tolerant programming in scientific computation on multiprocessors
Abstract
Efficient parallel algorithms proposed to solve many fundamental problems in scientific computation are sensitive to processor failures. Because of its low costs, algorithm-based fault tolerance is an interesting concept for introducing fault tolerance into existing multiprocessors. To facilitate fault-tolerant programming in scientific computation, we have modified and developed further an existing parallel run-time environment. In this paper the aspect of tuning known error processing techniques to the algorithm-based approach is primarily examined. Design issues for implementation and execution time overhead of a fault-tolerant application in our run-time environment are studied. In contrast to many other environments for parallel fault-tolerant programming, which use the master/slave programming model, our environment enables one to add fault tolerance to existing parallel applications in scientific computation.
Similar resources
Fault Tolerant DNA Computing Based on Digital Microfluidic Biochips
Historically, DNA molecules have been known as the building blocks of life. Later, in 1994, Leonard Adleman introduced a technique to utilize DNA molecules for a new kind of computation. Owing to its massive parallelism, huge storage capacity and the ability to use DNA molecules inside living tissue, this type of computation is applied in many application areas such as me...
Voting Algorithm Based on Adaptive Neuro Fuzzy Inference System for Fault Tolerant Systems
Some applications are critical and must be designed as fault-tolerant systems. A voting algorithm is usually one of the principal elements of a fault-tolerant system. Two kinds of voting algorithm are used in most applications: the majority voting algorithm and the weighted average algorithm. These algorithms have some problems. Majority voting faces the problem of threshold limits and voter of weight...
Fault-Tolerant Matrix Operations for Networks of Workstations Using Diskless Checkpointing
Networks of workstations (NOWs) offer a cost-effective platform for high-performance, long-running parallel computations. However, these computations must be able to tolerate the changing and often faulty nature of NOW environments. We present high-performance implementations of several fault-tolerant algorithms for distributed scientific computing. The fault-tolerance is based on diskless chec...
Scalable and Fault Tolerant Computation with the Sparse Grid Combination Technique
This paper continues to develop a fault tolerant extension of the sparse grid combination technique recently proposed in [B. Harding and M. Hegland, ANZIAM J., 54 (CTAC2012), pp. C394–C411]. The approach is novel for two reasons, first it provides several levels in which one can exploit parallelism leading towards massively parallel implementations, and second, it provides algorithm-based fault...